Reward Shaping and Mixed Resolution Function Approximation

Author

Marek Grzes

Abstract

In contrast to supervised learning, reinforcement learning (RL) agents are not given instructive feedback on what the best decision in a particular situation is. This leads to the temporal credit assignment problem, that is, the problem of determining which part of the behaviour deserves the reward (Sutton, 1984). To address this issue, the iterative approach to RL applies backpropagation of the value function in the state space. Because this is a delayed, iterative technique, it usually leads to slow convergence, especially when the state space is huge. In fact, the state space grows exponentially with each variable added to the encoding of the environment when the Markov property needs to be preserved (Sutton & Barto, 1998). When the state space is huge, the tabular representation of the value function with a separate entry for each state or state-action pair becomes...

Similar resources

Reinforcement Learning with Reward Shaping and Mixed Resolution Function Approximation

A crucial trade-off is involved in the design process when function approximation is used in reinforcement learning. Ideally, the chosen representation should allow as close an approximation of the value function as possible. However, the more expressive the representation, the more training data is needed, because the space of candidate hypotheses is bigger. A less expressive represe...

Abstract MDP Reward Shaping for Multi-Agent Reinforcement Learning

Kyriakos Efthymiadis, Sam Devlin and Daniel Kudenko, Department of Computer Science, The University of York, UK. Reward shaping has been shown to significantly improve an agent’s performance in reinforcement learning. As attention is shifting from tabula-rasa approaches to methods where some heuristic domain knowledge can be give...

Reward Shaping for Statistical Optimisation of Dialogue Management

This paper investigates the impact of reward shaping on a reinforcement learning-based spoken dialogue system’s learning. A diffuse reward function gives a reward after each transition between two dialogue states. A sparse function only gives a reward at the end of the dialogue. Reward shaping consists of learning a diffuse function without modifying the optimal policy compared to a sparse one....

Multiagent Learning with a Noisy Global Reward Signal

Scaling multiagent reinforcement learning to domains with many agents is a complex problem. In particular, multiagent credit assignment becomes a key issue as the system size increases. Some multiagent systems suffer from a global reward signal that is very noisy or difficult to analyze. This makes deriving a learnable local reward signal very difficult. Difference rewards (a particular instanc...

Imitation in Reinforcement Learning

The promise of imitation is to facilitate learning by allowing the learner to observe a teacher in action. Ideally this will lead to faster learning when the expert knows an optimal policy. Imitating a suboptimal teacher may slow learning, but it should not prevent the student from surpassing the teacher’s performance in the long run. Several researchers have looked at imitation in the context ...

Publication date: 2016
